Returns and Risks: The Mean and The Median

Updated on: 2022-11-29

1 Introduction

In this project, I investigate whether using median returns in portfolio optimization leads to better portfolio performance than using mean returns. Mean returns might not accurately describe expected returns because financial returns are typically not normally distributed. The mean is also known to be affected by outliers, whereas the median tends to be a more robust measure of central tendency.

I found that portfolios maximizing the median returns led to better diversification than portfolios maximizing the mean returns. Furthermore, minimizing the median absolute deviation resulted in a portfolio that had the highest return among portfolios minimizing the different risk measures. However, this project does not include an optimal portfolio selection which maximizes return for a given level of risk, or minimizes risk for a given level of return. The project also has other limitations, such as the small number of random portfolios generated for testing and the lack of conclusive evidence that the median measure is better than the mean.

2 Packages Required

library(doParallel) # For parallel computation in foreach loops
library(PortfolioAnalytics) # For portfolio optimization and analysis
library(RColorBrewer) # For color palettes in plots
library(tidyquant) # For quantmod and PerformanceAnalytics functions
library(tidyverse) # For dplyr and ggplot2 functions (data manipulation and plotting)

3 Retrieve Stock Prices and Calculate Returns

I retrieved the daily adjusted closing prices of 10 stocks from Yahoo Finance, from January 2015 to June 2022. The daily returns are then calculated using the discrete (simple) method, for use in calculating portfolio returns.

The 10 stocks used in this project are Procter & Gamble (PG), Walmart (WMT), Booking Holdings (BKNG), Salesforce (CRM), 3M (MMM), Starbucks (SBUX), Walt Disney (DIS), Home Depot (HD), Coca-Cola (KO), and NVIDIA (NVDA).

3.1 Retrieve Price Data

The daily adjusted closing price of the tickers can be retrieved using quantmod::getSymbols().

# Vector of tickers to include in portfolio
tickers <- c("PG", "WMT", "BKNG", "CRM", "MMM", "SBUX", "DIS", "HD", "KO", "NVDA")

startdate <- as.Date("2015-01-01")
enddate <- as.Date("2022-07-01")

price_data <- NULL

# Loop to get adjusted closing prices for all stocks
for(t in tickers) {
  price_data <- cbind(price_data,
                      quantmod::getSymbols(Symbols = t, src = "yahoo", auto.assign = FALSE,
                                           from = startdate, to = enddate, periodicity = "daily") %>% Ad())
}

# Check dimension of object, start and end date of data collected
dim(price_data); start(price_data); end(price_data)

## [1] 1887   10

## [1] "2015-01-02"

## [1] "2022-06-30"

# See first 6 observations in price_data
data.frame(head(price_data))

3.2 Stock Returns for Portfolio

Discrete returns can be calculated using PerformanceAnalytics::Return.calculate().

return_data <- na.omit(PerformanceAnalytics::Return.calculate(prices = price_data, method = "discrete")) %>%
  `colnames<-`(paste("R", tickers, sep = "_"))

dim(return_data); data.frame(head(return_data))

## [1] 1886   10
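As a minimal illustration of the discrete method, the sketch below computes simple returns \(R_t = P_t / P_{t-1} - 1\) in base R. The prices and the name toy_prices are made up for illustration and are not taken from the downloaded data.

```r
# Hypothetical prices for illustration only (not from Yahoo Finance)
toy_prices <- c(100, 102, 99.96, 104.958)

# Discrete (simple) return: R_t = P_t / P_{t-1} - 1
toy_simple <- toy_prices[-1] / toy_prices[-length(toy_prices)] - 1

# The first price has no previous price, so one observation is lost
length(toy_simple)
round(toy_simple, 4)
```

The lost first observation is why return_data has 1,886 rows while price_data has 1,887.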

4 Measuring Spread of Data

4.1 Mean and Median

The (arithmetic) mean and median are two measures of central tendency that can be used to describe expected returns of a stock.

Since the distribution of stock returns tends to be non-normal, the median may be a more appropriate measure. For example, the density plot of PG shows that the returns do not follow a normal distribution. The Q-Q plot in the top-left corner also indicates a non-normal distribution.

chart.Histogram(R = return_data$R_PG, 
                method = c("add.density", "add.normal", "add.qqplot"), 
                main = "Density Plot of PG Historical Returns")

legend(x = "topright", legend = c("Density Plot", "Normal Distribution"), lwd = 2, col = c("darkblue", "blue"))

The mean and median returns of each stock in the portfolio are:

stock_means <- apply(X = return_data, MARGIN = 2, FUN = mean)

stock_medians <- apply(X = return_data, MARGIN = 2, FUN = median)

data.frame(rbind(Mean = stock_means, Median = stock_medians))

Although the mean and median return values are quite small, we can notice a difference between the two measures.
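To see why the two measures can differ, the sketch below adds a single extreme observation to a small set of made-up returns. The values and the names calm_days and with_crash are hypothetical and only serve to illustrate the sensitivity of the mean to outliers.

```r
# Hypothetical daily returns for illustration only
calm_days <- c(-0.010, -0.005, 0.000, 0.005, 0.010)

# Add one extreme observation, for example a crash day
with_crash <- c(calm_days, -0.200)

# Without the outlier, mean and median agree
c(mean = mean(calm_days), median = median(calm_days))

# With the outlier, the mean moves far more than the median
c(mean = mean(with_crash), median = median(with_crash))
```

In this toy example the mean falls to about -3.3% while the median only moves to -0.25%.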

4.2 Variance (or Standard Deviation)

The variance measures the average squared deviation of returns from the mean. Its square root, the standard deviation (SD), is used instead since it is in the same units as returns.

The SD of each stock in the portfolio is:

stock_sd <- apply(X = return_data, MARGIN = 2, FUN = sd)

data.frame(rbind(SD = stock_sd))

If returns were normally distributed, we could use the 68-95-99.7 rule, where approximately 68%/95%/99.7% of returns lie within 1SD/2SD/3SD of the mean.
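The coverage figures of a normal distribution can be checked directly with pnorm(). The helper within_k_sd() below is a hypothetical name introduced here; it is only a sketch of how the empirical share of observations within k SDs of the mean could be measured for comparison.

```r
# Probability that a normal variable lies within k SDs of its mean
k <- 1:3
normal_coverage <- pnorm(k) - pnorm(-k)
round(100 * normal_coverage, 1)

# Hypothetical helper: empirical share of observations within k SDs of the mean
within_k_sd <- function(x, k) mean(abs(x - mean(x)) <= k * sd(x))
```

Applying within_k_sd() to a column of return_data and comparing with normal_coverage would be one way to gauge how far a stock departs from normality.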

4.3 Mean Absolute Deviation

The mean absolute deviation (MAD) is the mean of the absolute deviations between returns and a central point. The central point usually refers to the mean, but the median can be used as well.

\(MAD = \frac{1}{N}\sum_{i=1}^N |R_i - m(R)|\), where \(R_i\) is the return of a stock, and \(m(R)\) is the mean or median return of the stock.

The MAD around the mean and median of each stock in the portfolio are:

stock_MAD <- apply(X = return_data, MARGIN = 2, FUN = function(x) {
  mean(abs(x - mean(x)))
})

stock_MADmed <- apply(X = return_data, MARGIN = 2, FUN = function(x) {
  mean(abs(x - median(x)))
})

data.frame(rbind(MAD_mean = stock_MAD, MAD_median = stock_MADmed))

We can see that the MAD of each stock is smaller than its SD as SD places more weight on outliers than MAD. The MAD calculated with the mean and median are quite similar. I would stick with using MAD around the mean as a measure of risk.

4.4 Median Absolute Deviation

The median absolute deviation (also abbreviated as MAD, but to distinguish the two measures, I use MeAD instead) is the median of the absolute deviations between returns and their median. It is seen as a more robust measure of variability than MAD or SD.

\(MeAD = med |R_i - med(R)|\), where \(R_i\) is the return of a stock, and \(med(R)\) is the median return of the stock.

The MeAD of each stock in the portfolio is:

stock_MeAD <- apply(X = return_data, MARGIN = 2, FUN = function(x) {
  median(abs(x - median(x)))
})

data.frame(rbind(MeAD = stock_MeAD))
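The robustness of the three risk measures can be compared on a toy example. In the sketch below, the vectors calm and shock and the helper names are hypothetical; shock is simply calm with one large observation appended.

```r
mean_abs_dev   <- function(x) mean(abs(x - mean(x)))
median_abs_dev <- function(x) median(abs(x - median(x)))

# Hypothetical returns for illustration only
calm  <- c(-0.02, -0.01, 0.00, 0.01, 0.02)
shock <- c(calm, 0.30)

risk_table <- rbind(
  SD   = c(calm = sd(calm),             shock = sd(shock)),
  MAD  = c(calm = mean_abs_dev(calm),   shock = mean_abs_dev(shock)),
  MeAD = c(calm = median_abs_dev(calm), shock = median_abs_dev(shock))
)
risk_table

# Relative increase caused by the single outlier
risk_table[, "shock"] / risk_table[, "calm"]

# Base R's stats::mad() gives the same unscaled value when constant = 1
all.equal(median_abs_dev(shock), mad(shock, constant = 1))
```

Note that mad() multiplies by 1.4826 by default, so constant = 1 is needed to match the formula used here. In this example the outlier inflates SD and MAD several times over, while MeAD rises by only half.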

4.5 Summary

The different return and risk measures based on mean and median are summarized below:

data.frame(rbind(Mean = stock_means, Median = stock_medians, 
                 SD = stock_sd, MAD_mean = stock_MAD, MAD_median = stock_MADmed, MeAD = stock_MeAD))

Portfolio optimization using mean returns with variance/SD or MAD as risk measures is common, and a simple Google search returns many research papers and articles on it. Using median returns instead of mean returns has also been widely researched. However, the use of MeAD as a risk measure in portfolio optimization does not seem to be documented. The project tested MeAD as part of the research, but it may be a spurious measure of risk, as no statistical tests or simulations were carried out to establish its significance.

5 Random Portfolios

5.1 Generate Random Portfolios

Optimization based on median measures is usually not implemented by packages and functions, so I had to use a set of random hypothetical portfolios in this project.

Before optimizing any objectives, I generated a set of random portfolios which satisfy constraints where the sum of the component weights must be equal to 1 and the weight of each component is between 0% and 100% of the portfolio.

portspec <- PortfolioAnalytics::portfolio.spec(assets = tickers)

# Sum of weights constrained to 1, can also specify as type = "full investment"
portspec <- PortfolioAnalytics::add.constraint(portfolio = portspec,
                                               type = "weight_sum",
                                               min_sum = 1, max_sum = 1)

# Weight of each portfolio component can vary between minimum of 0% and maximum of 100%
portspec <- PortfolioAnalytics::add.constraint(portfolio = portspec,
                                               type="box", 
                                               min = 0, max = 1)

portspec

## **************************************************
## PortfolioAnalytics Portfolio Specification 
## **************************************************
## 
## Call:
## PortfolioAnalytics::portfolio.spec(assets = tickers)
## 
## Number of assets: 10 
## Asset Names
##  [1] "PG"   "WMT"  "BKNG" "CRM"  "MMM"  "SBUX" "DIS"  "HD"   "KO"   "NVDA"
## 
## Constraints
## Enabled constraint types
##      - weight_sum 
##      - box (long only)

set.seed(43594)

rand_port <- PortfolioAnalytics::random_portfolios(portfolio = portspec, 
                                                   permutations = 50000, 
                                                   rp_method = "sample", 
                                                   eliminate = TRUE)

dim(rand_port); head(rand_port)

## [1] 4341   10

##         PG   WMT  BKNG   CRM   MMM  SBUX   DIS    HD    KO  NVDA
## [1,] 0.100 0.100 0.100 0.100 0.100 0.100 0.100 0.100 0.100 0.100
## [2,] 0.090 0.380 0.026 0.008 0.084 0.000 0.052 0.164 0.174 0.022
## [3,] 0.034 0.074 0.120 0.190 0.260 0.010 0.102 0.012 0.172 0.026
## [4,] 0.154 0.002 0.028 0.216 0.152 0.016 0.216 0.032 0.000 0.184
## [5,] 0.104 0.000 0.000 0.378 0.000 0.006 0.500 0.004 0.000 0.008
## [6,] 0.046 0.034 0.100 0.018 0.560 0.154 0.020 0.006 0.062 0.000

4,341 portfolio permutations were found, which will be used for the rest of this project.
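For intuition, long-only weights that sum to 1 can also be drawn in base R by normalizing positive random numbers. This sketch is not the "sample" algorithm used by PortfolioAnalytics; the seed, the number of portfolios and the name toy_weights are arbitrary assumptions.

```r
set.seed(1)  # arbitrary seed for reproducibility

n_assets <- 10
n_ports  <- 1000

# Draw positive numbers and scale each row to sum to 1.
# Normalized exponential draws are uniformly distributed over the set of
# long-only, fully invested weight vectors.
raw <- matrix(rexp(n_ports * n_assets), nrow = n_ports, ncol = n_assets)
toy_weights <- raw / rowSums(raw)

# Check the two constraints: weights sum to 1 and lie in [0, 1]
range(rowSums(toy_weights))
range(toy_weights)
```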

5.2 Optimization Strategy

I used two different optimization objectives in this project:

Maximize return
Minimize risk

To make the different strategies more practical, I attempted to replicate a half-yearly re-optimization strategy using the previous year's data. In this case, I calculate the return and/or risk of the hypothetical portfolios with 2015 return data and implement the optimal weights in 2016H1. Then, I re-optimize the weights using data from 2015H2 to 2016H1 for 2016H2, and so on. The last re-optimization used 2021 data for 2022H1, since I only retrieved data up to 30 June 2022.

opt_periods <- c("2015", "2015-07/2016-06", 
                 "2016", "2016-07/2017-06", 
                 "2017", "2017-07/2018-06", 
                 "2018", "2018-07/2019-06", 
                 "2019", "2019-07/2020-06", 
                 "2020", "2020-07/2021-06", 
                 "2021")

ret_periods <- c("2016-01/2016-06", "2016-07/2016-12", 
                 "2017-01/2017-06", "2017-07/2017-12",
                 "2018-01/2018-06", "2018-07/2018-12",
                 "2019-01/2019-06", "2019-07/2019-12",
                 "2020-01/2020-06", "2020-07/2020-12",
                 "2021-01/2021-06", "2021-07/2021-12",
                 "2022-01/2022-06")

data.frame(cbind(Optimization_Period = opt_periods, Return_Period = ret_periods))
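The rolling windows could also be generated rather than typed by hand. The sketch below is one possible way to do so in base R; month_range() and the toy_ names are hypothetical. It writes calendar years as "2015-01/2015-12" instead of "2015", which is assumed to select the same rows when used to subset an xts object.

```r
# First month of each half-year return period, 2016H1 to 2022H1
ret_start <- seq(as.Date("2016-01-01"), as.Date("2022-01-01"), by = "6 months")
n_periods <- length(ret_start)

# Hypothetical helper: build a "YYYY-MM/YYYY-MM" range from two month dates
month_range <- function(from, to) {
  paste(format(from, "%Y-%m"), format(to, "%Y-%m"), sep = "/")
}

# Last month of each return period, and the 12-month window that precedes it
ret_end   <- seq(as.Date("2016-06-01"), by = "6 months", length.out = n_periods)
opt_start <- seq(as.Date("2015-01-01"), by = "6 months", length.out = n_periods)
opt_end   <- seq(as.Date("2015-12-01"), by = "6 months", length.out = n_periods)

toy_ret_periods <- month_range(ret_start, ret_end)
toy_opt_periods <- month_range(opt_start, opt_end)

data.frame(Optimization_Period = toy_opt_periods, Return_Period = toy_ret_periods)
```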

5.3 Daily Portfolio Returns

The daily returns of the hypothetical portfolios can be calculated from return_data and rand_port, based on the formula \(R_p = \sum_{i=1}^N R_i w_i\). I used geometric chaining to aggregate returns and rebalanced the portfolios quarterly. We can then use the performance of the random portfolios in each optimization period (opt_periods) to select the portfolio weights which optimized that period’s return and/or risk.

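# Daily returns of every random portfolio; each iteration yields one xts column
rp_returns <- foreach(i = 1:nrow(rand_port), .combine = "cbind") %do% {
  # Row i of rand_port holds the target weights, restored at each quarter end
  tmp <- PerformanceAnalytics::Return.portfolio(R = return_data,
                                                weights = rand_port[i, ],
                                                geometric = TRUE,
                                                rebalance_on = "quarters")
}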
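The formula and the geometric chaining can be traced by hand on a tiny example. In the sketch below, the returns, weights and toy_ names are made up. Unlike Return.portfolio() with quarterly rebalancing, where weights drift between rebalancing dates, this simplification holds the weights fixed every day.

```r
# Hypothetical returns: 3 days (rows) x 2 assets (columns)
toy_R <- matrix(c(0.01, -0.02,  0.03,
                  0.02,  0.01, -0.01),
                ncol = 2, dimnames = list(NULL, c("A", "B")))
toy_w <- c(A = 0.6, B = 0.4)

# R_p = sum_i R_i * w_i for each day, with weights held fixed
toy_Rp <- as.numeric(toy_R %*% toy_w)
toy_Rp

# Geometric chaining: cumulative return after t days = prod(1 + R_p) - 1
toy_cum <- cumprod(1 + toy_Rp) - 1
toy_cum
```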

I also calculated the returns of an equal-weight portfolio to compare the performance of optimized portfolios.

# If weights are omitted, an equal-weight portfolio is assumed
ewp_return <- PerformanceAnalytics::Return.portfolio(R = return_data,
                                                     geometric = TRUE,
                                                     rebalance_on = "quarters")

6 Best Return Portfolios

In this section, I find portfolios that maximize mean and median returns in each optimization period, although these types of objectives may not be practical in portfolio optimization. It assumes that investors create their portfolios based on the best historical returns. However, these strategies can give an idea of the risks taken by an investor, based on the drawdown of the portfolios.

6.1 Maximize Mean Return

Find weights that maximize mean portfolio returns in each optimization period:

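maxmean_weight <- foreach(i = opt_periods, .combine = "rbind") %do% {
  # Mean daily return of every random portfolio within optimization period i
  tmp <- PerformanceAnalytics::Mean.arithmetic(x = rp_returns[i, ])
  
  # Keep the weights of the portfolio with the highest mean return
  opt_weight <- rand_port[which.max(tmp), ]
}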

rownames(maxmean_weight) <- paste("OP", 1:nrow(maxmean_weight), sep = "")

data.frame(maxmean_weight)

Calculate return based on the selected portfolio weights in the return period:

# Returns of best mean portfolio in ret_period using weights from opt_period
maxmean_returns <- foreach(i = ret_periods, j = 1:nrow(maxmean_weight), .combine = "rbind") %do% {
  tmp <- PerformanceAnalytics::Return.portfolio(R = return_data[i, ], 
                                                weights = maxmean_weight[j, ], 
                                                geometric = TRUE, 
                                                rebalance_on = "quarters")
}

6.2 Maximize Median Return

Find weights that maximize median portfolio returns in each optimization period:

# Find weights that maximize the median return of each optimization period
maxmed_weight <- foreach(i = opt_periods, .combine = "rbind") %do% {
  tmp <- apply(X = rp_returns[i, ], MARGIN = 2, FUN = median)
  
  opt_weight <- rand_port[which.max(tmp), ]
}

rownames(maxmed_weight) <- paste("OP", 1:nrow(maxmed_weight), sep = "")

data.frame(maxmed_weight)
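The two selection rules can pick different portfolios when one candidate owes its high mean to a few extreme days. The sketch below shows this on made-up data; toy_rp and its column names are hypothetical stand-ins for the columns of rp_returns in one optimization period.

```r
# Hypothetical daily returns: 5 days (rows) x 3 candidate portfolios (columns)
toy_rp <- cbind(P1 = c(-0.01, 0.00, 0.01, 0.01, 0.10),
                P2 = c( 0.01, 0.01, 0.02, 0.02, 0.02),
                P3 = c(-0.02, 0.00, 0.00, 0.01, 0.01))

col_mean   <- apply(toy_rp, 2, mean)
col_median <- apply(toy_rp, 2, median)

# Candidate chosen under each objective
best_by_mean   <- names(which.max(col_mean))
best_by_median <- names(which.max(col_median))
c(mean = best_by_mean, median = best_by_median)
```

Here P1 wins on the mean because of its single 10% day, whereas P2 wins on the median because its typical day is better.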

Calculate return based on the selected portfolio weights in the return period:

# Returns of best median portfolio in ret_period using weights from opt_period
maxmed_returns <- foreach(i = ret_periods, j = 1:nrow(maxmed_weight), .combine = "rbind") %do% {
  tmp <- PerformanceAnalytics::Return.portfolio(R = return_data[i, ], 
                                                weights = maxmed_weight[j, ], 
                                                geometric = TRUE, 
                                                rebalance_on = "quarters")
}

6.3 Comparison of Best Return Portfolios

Plot weights of best mean and best median portfolios:

chart.StackedBar(w = maxmean_weight, colorset = RColorBrewer::brewer.pal(n = 10, "Spectral"),
                 main = "Optimal Weights of Best Mean Portfolio", ylab = "Weight")

chart.StackedBar(w = maxmed_weight, colorset = RColorBrewer::brewer.pal(n = 10, "Spectral"),
                 main = "Optimal Weights of Best Median Portfolio", ylab = "Weight")

Plot cumulative return of best mean and best median portfolios against equal-weight portfolio return:

best_return <- cbind(maxmean_returns, maxmed_returns, ewp_return["2016/",]) %>%
  `colnames<-`(c("Best_Mean", "Best_Median", "Equal Weight"))

chart.CumReturns(R = best_return, geometric = TRUE,
                 legend.loc = "topleft",
                 main = "Cumulative Return of Best Return Portfolios")

chart.Drawdown(R = best_return, geometric = TRUE,
               legend.loc = "bottomleft",
               main = "Drawdown of Best Return Portfolios")
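For reference, a drawdown series measures how far cumulative wealth sits below its running peak. The sketch below shows the usual definition on made-up returns; it is an illustration under that assumption rather than the exact code behind chart.Drawdown().

```r
# Hypothetical returns for illustration only
toy_r <- c(0.10, -0.20, 0.05, 0.30)

# Wealth index starting from 1, chained geometrically
wealth <- cumprod(1 + toy_r)

# Running peak, including the starting wealth of 1
peak <- cummax(c(1, wealth))[-1]

# Drawdown: percentage below the running peak (0 at a new high)
drawdown <- wealth / peak - 1
drawdown
min(drawdown)  # maximum drawdown
```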

Tables of annualized returns and risk measures:

table.AnnualizedReturns(R = best_return, scale = 252, Rf = 0.03/252, geometric = TRUE)

table.DownsideRisk(R = best_return, scale = 252, Rf = 0.03/252, MAR = 0.08/252)
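The scale = 252 argument converts daily figures to annual ones. The sketch below shows the standard geometric annualization and square-root-of-time scaling, which are assumed to match what the tables report; the helper names are hypothetical.

```r
# Hypothetical helpers, assuming 252 trading days per year
annualize_return <- function(r, scale = 252) {
  # Compound the daily returns, then rescale the horizon to one year
  prod(1 + r)^(scale / length(r)) - 1
}

annualize_sd <- function(r, scale = 252) {
  # Daily SD times the square root of the number of periods per year
  sd(r) * sqrt(scale)
}

# Example: a constant 0.1% daily return compounds to about 28.6% a year
annualize_return(rep(0.001, 10))

# Daily risk-free rate corresponding to Rf = 0.03/252 in the tables
rf_daily <- 0.03 / 252
```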

6.4 Summary

The Best Median portfolio had a lower annualized return and standard deviation than the Best Mean portfolio. Furthermore, it resulted in better diversification than the Best Mean portfolio, which was more concentrated in a single stock (mainly in NVDA). Therefore, using the median return in optimization may allow an investor to achieve a lower risk than using the mean return.

7 Minimum Risk Portfolios

At the opposite end of the spectrum, this section finds portfolios that minimize risk in each optimization period.

7.1 Minimize Variance

Find weights that minimize variance/standard deviation in each optimization period:

minstd_weight <- foreach(i = opt_periods, .combine = "rbind") %do% {
  tmp <- PerformanceAnalytics::StdDev(R = rp_returns[i, ])
  
  opt_weight <- rand_port[which.min(tmp), ]
}

rownames(minstd_weight) <- paste("OP", 1:nrow(minstd_weight), sep = "")

data.frame(minstd_weight)

Calculate return based on the selected portfolio weights in the return period:

minstd_returns <- foreach(i = ret_periods, j = 1:nrow(minstd_weight), .combine = "rbind") %do% {
  tmp <- PerformanceAnalytics::Return.portfolio(R = return_data[i, ], 
                                                weights = minstd_weight[j, ], 
                                                geometric = TRUE, 
                                                rebalance_on = "quarters")
}

7.2 Minimize MAD

Find weights that minimize MAD in each optimization period:

minmad_weight <- foreach(i = opt_periods, .combine = "rbind") %do% {
  tmp <- apply(X = rp_returns[i, ], MARGIN = 2, FUN = function(x) {
    mean(abs(x - mean(x)))
    })
  
  opt_weight <- rand_port[which.min(tmp), ]
}

rownames(minmad_weight) <- paste("OP", 1:nrow(minmad_weight), sep = "")

data.frame(minmad_weight)

Calculate return based on the selected portfolio weights in the return period:

minmad_returns <- foreach(i = ret_periods, j = 1:nrow(minmad_weight), .combine = "rbind") %do% {
  tmp <- PerformanceAnalytics::Return.portfolio(R = return_data[i, ], 
                                                weights = minmad_weight[j, ], 
                                                geometric = TRUE, 
                                                rebalance_on = "quarters")
}

7.3 Minimize MeAD

Find weights that minimize MeAD in each optimization period:

minmead_weight <- foreach(i = opt_periods, .combine = "rbind") %do% {
  tmp <- apply(X = rp_returns[i, ], MARGIN = 2, FUN = function(x) {
    median(abs(x - median(x)))
    })
  
  opt_weight <- rand_port[which.min(tmp), ]
}

rownames(minmead_weight) <- paste("OP", 1:nrow(minmead_weight), sep = "")

data.frame(minmead_weight)
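The three risk objectives need not agree on which portfolio is the least risky. The sketch below applies all three measures to the same made-up candidates; risk_toy, risk_measures and the column names are hypothetical, and the data were chosen so that each measure selects a different column.

```r
# Hypothetical daily returns: 6 days (rows) x 3 candidate portfolios (columns)
risk_toy <- cbind(
  P1 = c(-0.010, -0.010,  0.000, 0.000,  0.010, 0.010),  # evenly spread
  P2 = c(-0.001,  0.000,  0.000, 0.000,  0.001, 0.060),  # quiet, one big outlier
  P3 = c(-0.001,  0.001, -0.001, 0.001, -0.017, 0.017)   # quiet, two moderate outliers
)

risk_measures <- list(
  SD   = sd,
  MAD  = function(x) mean(abs(x - mean(x))),
  MeAD = function(x) median(abs(x - median(x)))
)

# For each risk measure, name the candidate with the smallest value
min_risk_pick <- sapply(risk_measures, function(f) {
  names(which.min(apply(risk_toy, 2, f)))
})
min_risk_pick
```

MeAD selects P2 despite its 6% outlier, which illustrates how a robust measure can overlook rare but large moves.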

Calculate return based on the selected portfolio weights in the return period:

minmead_returns <- foreach(i = ret_periods, j = 1:nrow(minmead_weight), .combine = "rbind") %do% {
  tmp <- PerformanceAnalytics::Return.portfolio(R = return_data[i, ], 
                                                weights = minmead_weight[j, ], 
                                                geometric = TRUE, 
                                                rebalance_on = "quarters")
}

7.4 Comparison of Minimum Risk Portfolios

Plot weights of minimum risk portfolios:

chart.StackedBar(w = minstd_weight, colorset = RColorBrewer::brewer.pal(n = 10, "Spectral"),
                 main = "Optimal Weights of Minimum Variance Portfolio", ylab = "Weight")

chart.StackedBar(w = minmad_weight, colorset = RColorBrewer::brewer.pal(n = 10, "Spectral"),
                 main = "Optimal Weights of Minimum MAD Portfolio", ylab = "Weight")

chart.StackedBar(w = minmead_weight, colorset = RColorBrewer::brewer.pal(n = 10, "Spectral"),
                 main = "Optimal Weights of Minimum MeAD Portfolio", ylab = "Weight")

Plot cumulative return of minimum risk portfolios against equal-weight portfolio return:

min_risk <- cbind(minstd_returns, minmad_returns, minmead_returns, ewp_return["2016/",]) %>%
  `colnames<-`(c("Min_Var", "Min_MAD", "Min_MeAD", "Equal Weight"))

chart.CumReturns(R = min_risk, geometric = TRUE,
                 legend.loc = "topleft",
                 main = "Cumulative Return of Minimum Risk Portfolios")

chart.Drawdown(R = min_risk, geometric = TRUE,
               legend.loc = "bottomleft",
               main = "Drawdown of Minimum Risk Portfolios")

Tables of annualized returns, risk measures and statistics:

table.AnnualizedReturns(R = min_risk, scale = 252, Rf = 0.03/252, geometric = TRUE)

table.DownsideRisk(R = min_risk, scale = 252, Rf = 0.03/252, MAR = 0.08/252)

7.5 Summary

The equal-weight portfolio had the highest annualized return and standard deviation, compared to the minimum risk portfolios. While the Minimum MeAD portfolio performed the best among the minimum risk portfolios, it cannot be concluded that the MeAD should be used extensively in practice, given the lack of testing.

8 Conclusion

The results of this project suggest that portfolios maximizing the median returns lead to better diversification than portfolios maximizing the mean returns. Furthermore, minimizing the median absolute deviation resulted in a portfolio that had the highest return among portfolios minimizing the different risk measures.

It should be stressed that the results are not generalizable, as only 10 stocks were chosen for this project and the random portfolios generated were just a small subset of an astronomical number of possible permutations. However, the results are consistent with this paper, which found that median models provided benefits in portfolio diversification and returns. This suggests that the median measure is useful in certain scenarios and should be considered in portfolio optimization.

References

Frost, J. Mean Absolute Deviation: Definition, Finding & Formula. Statistics By Jim. Retrieved 28 July 2022, from https://statisticsbyjim.com/basics/mean-absolute-deviation/

Wikipedia. (2022). Average absolute deviation. Retrieved 29 July 2022, from https://en.wikipedia.org/wiki/Average_absolute_deviation

Wikipedia. (2022). Median absolute deviation. Retrieved 28 July 2022, from https://en.wikipedia.org/wiki/Median_absolute_deviation


Tan Zheng Liang

2022-08-21
